70 research outputs found

    Trajectory Representation and Landmark Projection for Continuous-Time Structure from Motion

    Full text link
    This paper revisits the problem of continuous-time structure from motion, and introduces a number of extensions that improve convergence and efficiency. The formulation with a C2\mathcal{C}^2-continuous spline for the trajectory naturally incorporates inertial measurements, as derivatives of the sought trajectory. We analyse the behaviour of split interpolation on SO(3)\mathbb{SO}(3) and on R3\mathbb{R}^3, and a joint interpolation on SE(3)\mathbb{SE}(3), and show that the latter implicitly couples the direction of translation and rotation. Such an assumption can make good sense for a camera mounted on a robot arm, but not for hand-held or body-mounted cameras. Our experiments show that split interpolation on SO(3)\mathbb{SO}(3) and on R3\mathbb{R}^3 is preferable over SE(3)\mathbb{SE}(3) interpolation in all tested cases. Finally, we investigate the problem of landmark reprojection on rolling shutter cameras, and show that the tested reprojection methods give similar quality, while their computational load varies by a factor of 2.Comment: Submitted to IJR

    Registration Loss Learning for Deep Probabilistic Point Set Registration

    Full text link
    Probabilistic methods for point set registration have interesting theoretical properties, such as linear complexity in the number of used points, and they easily generalize to joint registration of multiple point sets. In this work, we improve their recognition performance to match state of the art. This is done by incorporating learned features, by adding a von Mises-Fisher feature model in each mixture component, and by using learned attention weights. We learn these jointly using a registration loss learning strategy (RLL) that directly uses the registration error as a loss, by back-propagating through the registration iterations. This is possible as the probabilistic registration is fully differentiable, and the result is a learning framework that is truly end-to-end. We perform extensive experiments on the 3DMatch and Kitti datasets. The experiments demonstrate that our approach benefits significantly from the integration of the learned features and our learning strategy, outperforming the state-of-the-art on Kitti. Code is available at https://github.com/felja633/RLLReg.Comment: 3DV 202

    Camera Calibration without Camera Access -- A Robust Validation Technique for Extended PnP Methods

    Full text link
    A challenge in image based metrology and forensics is intrinsic camera calibration when the used camera is unavailable. The unavailability raises two questions. The first question is how to find the projection model that describes the camera, and the second is to detect incorrect models. In this work, we use off-the-shelf extended PnP-methods to find the model from 2D-3D correspondences, and propose a method for model validation. The most common strategy for evaluating a projection model is comparing different models' residual variances - however, this naive strategy cannot distinguish whether the projection model is potentially underfitted or overfitted. To this end, we model the residual errors for each correspondence, individually scale all residuals using a predicted variance and test if the new residuals are drawn from a standard normal distribution. We demonstrate the effectiveness of our proposed validation in experiments on synthetic data, simulating 2D detection and Lidar measurements. Additionally, we provide experiments using data from an actual scene and compare non-camera access and camera access calibrations. Last, we use our method to validate annotations in MegaDepth

    Structure and motion estimation from rolling shutter video

    Full text link
    The majority of consumer quality cameras sold today have CMOS sensors with rolling shutters. In a rolling shutter camera, images are read out row by row, and thus each row is exposed during a different time interval. A rolling-shutter exposure causes geometric image distortions when either the camera or the scene is moving, and this causes state-of-the-art structure and motion algorithms to fail. We demonstrate a novel method for solving the structure and motion problem for rolling-shutter video. The method relies on exploiting the continuity of the camera motion, both between frames, and across a frame. We demonstrate the effectiveness of our method by controlled experiments on real video sequences. We show, both visually and quantitatively, that our method outperforms standard structure and motion, and is more accurate and efficient than a two-step approach, doing image rectification and structure and motion

    Модуль контроля концентрации аммиака в воздухе производственных помещений

    Get PDF
    Fourier descriptors (FDs) is a classical but still popular method for contour matching. The key idea is to apply the Fourier transform to a periodic representation of the contour, which results in a shape descriptor in the frequency domain. Fourier descriptors have mostly been used to compare object silhouettes and object contours; we instead use this well established machinery to describe local regions to be used in an object recognition framework. We extract local regions using the Maximally Stable Extremal Regions (MSER) detector and represent the external contour by FDs. Many approaches to matching FDs are based on the magnitude of each FD component, thus ignoring the information contained in the phase. Keeping the phase information requires us to take into account the global rotation of the contour and shifting of the contour samples. We show that the sum-of-squared differences of FDs can be computed without explicitly de-rotating the contours. We compare our correlation based matching against affine-invariant Fourier descriptors (AFDs) and demonstrate that our correlation based approach outperforms AFDs on real world data.©2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.Fredrik Larsson, Michael Felsberg and Per-Erik Forssén, Patch Contour Matching by Correlating Fourier Descriptors, 2009, Digital Image Computing: Techniquesand Applications (DICTA), Melbourne, Australia, December 2009. IEEE Computer Society

    Person re-identification via efficient inference in fully connected CRF

    Full text link
    In this paper, we address the problem of person re-identification problem, i.e., retrieving instances from gallery which are generated by the same person as the given probe image. This is very challenging because the person's appearance usually undergoes significant variations due to changes in illumination, camera angle and view, background clutter, and occlusion over the camera network. In this paper, we assume that the matched gallery images should not only be similar to the probe, but also be similar to each other, under suitable metric. We express this assumption with a fully connected CRF model in which each node corresponds to a gallery and every pair of nodes are connected by an edge. A label variable is associated with each node to indicate whether the corresponding image is from target person. We define unary potential for each node using existing feature calculation and matching techniques, which reflect the similarity between probe and gallery image, and define pairwise potential for each edge in terms of a weighed combination of Gaussian kernels, which encode appearance similarity between pair of gallery images. The specific form of pairwise potential allows us to exploit an efficient inference algorithm to calculate the marginal distribution of each label variable for this dense connected CRF. We show the superiority of our method by applying it to public datasets and comparing with the state of the art.Comment: 7 pages, 4 figure

    GMSF: Global Matching Scene Flow

    Full text link
    We tackle the task of scene flow estimation from point clouds. Given a source and a target point cloud, the objective is to estimate a translation from each point in the source point cloud to the target, resulting in a 3D motion vector field. Previous dominant scene flow estimation methods require complicated coarse-to-fine or recurrent architectures as a multi-stage refinement. In contrast, we propose a significantly simpler single-scale one-shot global matching to address the problem. Our key finding is that reliable feature similarity between point pairs is essential and sufficient to estimate accurate scene flow. To this end, we propose to decompose the feature extraction step via a hybrid local-global-cross transformer architecture which is crucial to accurate and robust feature representations. Extensive experiments show that GMSF sets a new state-of-the-art on multiple scene flow estimation benchmarks. On FlyingThings3D, with the presence of occlusion points, GMSF reduces the outlier percentage from the previous best performance of 27.4% to 11.7%. On KITTI Scene Flow, without any fine-tuning, our proposed method shows state-of-the-art performance

    Stabilizing cell phone video using inertial measurement sensors. In:

    Get PDF
    Abstract We present a system that rectifies and stabilizes video sequences on mobile devices with rolling-shutter cameras. The system corrects for rolling-shutter distortions using measurements from accelerometer and gyroscope sensors, and a 3D rotational distortion model. In order to obtain a stabilized video, and at the same time keep most content in view, we propose an adaptive low-pass filter algorithm to obtain the output camera trajectory. The accuracy of the orientation estimates has been evaluated experimentally using ground truth data from a motion capture system. We have conducted a user study, where the output from our system, implemented in iOS, has been compared to that of three other applications, as well as to the uncorrected video. The study shows that users prefer our sensor-based system
    corecore